Abstract

Research in second language speech has often focused on listeners’ accent judgment and 
factors that affect their perception. However, the topic of listeners’ application of 
specific sound categories in their own perceptual judgments has not been widely 
investigated. The current study explored how listeners from diverse language 
backgrounds weighed phonetic parameters (i.e., segmental features such as consonants 
and vowels and suprasegmental features such as word stress and sentence stress) 
differently when perceiving non-native speakers’ accented speech. Two hundred forty 
listeners, including American, Vietnamese, and Arabic students, rated Vietnamese-accented
English for intelligibility, comprehensibility, and accentedness. Within this
group of participants, 112 raters also provided interview responses to questions related 
to their perception of accented speech in general. The results suggest that listeners of 
English perceived degree of accent in fundamentally different ways, depending on 
factors such as their first language and their English instruction backgrounds. Features 
identified in this study can be useful both in the listeners’ global judgments and in the 
communicative situations in which second language learners need to function. 

Keywords: Accent judgment, listener background, segmental, suprasegmental

Introduction 

Research on the effect of listeners’ first language (L1) background on their perceptual
judgments has been mixed thus far. Some studies have shown that L1 effects are small
and not consistently observable (e.g., Munro, Derwing, & Morton, 2006), while others 
found significant differences between native speakers (NS) and non-native speakers 
(NNS; e.g., Riney, Takagi, & Inutsuka, 2005). An overall consensus seems to be that 
listeners’ perceptions can be affected by the speech properties of speakers or by
listener factors such as language experience (Munro, 2008). The current
study sought to advance the understanding of factors that affect listeners’ perceptions of 
non-native speech, particularly by investigating the impact of listeners’ own language 
background on their perceptual judgments of accented English speech. 

Previous research in speech perception often focused on global ratings of listeners, but 
not on listeners’ application of specific segmentals or suprasegmentals. The relationship 
between NNSs’ focus of pronunciation instruction (i.e., which segmental and/or 
suprasegmental features pronunciation teachers emphasized explicitly) and their accent 
perception has also rarely been investigated. In addition, with notable exceptions (e.g.,
Ortmeyer & Boyle, 1985; Smith & Bisazza, 1982; Wilcox, 1978), listeners as research 
participants in the past have been either NSs or NNS English as a second language
(ESL) students residing in the USA, but not necessarily speakers of English as a
foreign language (EFL). Consequently, this study investigated how listeners from 
different first language and language learning backgrounds applied phonetic parameters 
differently when perceiving NNSs’ accented speech in English. 

The phonetic parameters in this study refer to segmentals and suprasegmentals. 
Segmentals are minimal units of sound (vowels and consonants) defined in phonetic 
terms (Pennington & Richards, 1986) while suprasegmentals refer to "a vocal effect 
which extends over more than one sound segment in an utterance, such as a pitch, stress 
or juncture pattern" (Crystal, 2003, p. 446). In this study, specific phonetic parameters 
(i.e., word stress, sentence stress, and particular consonants and vowels) in English 
were targeted and altered in such a way that is commonly heard in Vietnamese-accented 
speech. We were interested in understanding how untrained impressionistic judgments 
and the phonetic parameters that influence them differ by the listeners’ background; 
that is, the actual accuracy of listeners’ judgments was not the focus of this study. 

Review of Literature 

Different L1 Backgrounds in Listeners' Judgments of Accented Speech

Various research studies have examined listeners’ accent judgments and factors that 
affect these evaluations. When the contribution of segmental and suprasegmental 
features on listeners’ judgments of accented speech is discussed, one important factor to 
be taken into consideration is the listeners’ backgrounds. Gass and Varonis (1984) 
demonstrated that listeners’ judgments are affected by their language experience. They 
found that listeners’ familiarity with the topic, accent, speaker, and L2 speech were 
strongly correlated with their judgments of intelligibility. One factor that may contribute 
to greater tolerance of listeners from particular language backgrounds for particular 
NNS accents is "the interlanguage speech intelligibility benefit" (Bent & Bradlow, 2003,
p. 1602), which predicts that an NNS listener may be better equipped to interpret specific
acoustic-phonetic features of L2 speech that match his own L1 than those that match a
different L1. Although findings regarding the interlanguage speech intelligibility benefit were
mixed in Major, Fitzmaurice, Bunta, and Balasubramanian's (2002) study, Spanish
listeners seemed to benefit from their L1 accent, scoring better on a listening
comprehension test featuring a Spanish speaker than did those of other L1 backgrounds.
Also, Chinese and Japanese listeners were found to understand Spanish-accented English
rather well. Major et al. (2002) suggested that this phenomenon could be due to a 
similar lack of vowel reduction found among Chinese, Japanese, and Spanish; however,
other factors such as listener attitudes or the fact that the Spanish speakers had less of 
an accent are possible causes as well. 

In contrasting studies, the L1 effect on listeners’ judgments has been shown to be minimal, if
present at all. Listeners can show moderate to high correlation on global accent judgments
regardless of L1 background (Munro et al., 2006). Few differences were found in the
ratings of accented speech between NS and NNS listeners (MacKay, Flege, & Imai, 2006). 
In judgments of oral English performance, NS and NNS teachers exhibited similar 
severity patterns (Kim, 2009). Flege (1988) also confirmed that there was no consistent 
pattern found in ratings of perceived foreign accent among different groups of listeners 
(i.e., high-proficient experienced Chinese, low-proficient inexperienced Chinese, and 
experienced American) when listening to English sentences spoken by native speakers 
of English and Chinese. In light of this conflicting research, more empirical studies are 
clearly needed to provide further evidence on these issues (Munro, 2008). 

In related studies, novice NNS raters appeared to be harsher than NSs in judgments of
accented speech (Kang, 2008, 2012). These fundamental differences between NSs’ and
NNSs’ judgments might be based on the different phonetic parameters
(i.e., segmentals vs. suprasegmentals) that raters applied (Riney et al., 2005). For
example, in Riney et al.'s (2005) study, two trained phoneticians conducted an auditory 
analysis on L2 sentences that untrained Japanese and American learners judged 
dissimilarly on the construct of accent. They found that the Japanese listeners used 
primarily nonsegmental parameters (specifically intonation, fluency, and speech rate) to 
make perceptual judgments, whereas segmental parameters had a relatively minor role. 
In contrast, the American listeners exhibited the opposite pattern; that is, they applied 
more segmental parameters (/l/ and /r/) but nonsegmentals played a minor role. These 
findings suggest that NS and NNS listeners perceive degree of accent in English in 
fundamentally different ways based on different phonetic parameters. 

Overall, the question regarding how listeners from different L1 backgrounds perceive L2
speech still remains unclear. It is common for different groups of ESL/EFL speakers to 
use English for international communication, and their different perceptions of NNSs’ 
speech continue to affect their interactions (Major, 2007). However, few studies have 
focused on the understanding of listeners’ judgment process through their self-reports. 
The current study investigated how different groups of listeners differed in their 
judgments of accented speech. More specifically, the primary research question 
addressed is: When different groups of listeners (NSs, NNSs from the same L1 as the
speaker, and NNSs from a different L1 than the speaker) perceive English accented
speech, how do phonetic parameters influence their perceptual judgments? 

Phonetic Parameters in Listeners' Judgments 

Phonetic parameters used in the current study refer to specific pronunciation features, 
such as vowels and consonants for segmentals and lexical and sentence stress for
suprasegmentals. In particular, specific features from both segmental and 
suprasegmental components of English speech were chosen to be altered in accordance 
with typical Vietnamese-accented speech.

Researchers have argued over the roles that segmentals and suprasegmentals play in
speech perception and intelligibility (Anderson-Hsieh, Johnson, & Koehler, 1992; Flege, 
1981; Jenkins, 2002; Riney et al., 2005). First of all, segmental features can play an 
important role in speech perception. Segmental errors were found to contribute greatly 
to a foreign accent and to have detrimental effects on L2 comprehension (Fayer & 
Krasinski, 1987). Cutler and van Donselaar (2001) posited that although Dutch listeners 
used suprasegmental cues for word recognition in their native language, the 
contribution of segmental features was more important than that of suprasegmental 
features. According to Flege (1981), one of the most apparent features for a foreign 
accent is derived from segmental sound substitutions such as in French-accented "I sink
so" or Arabic-accented "I put my car in the barking lot." In short excerpts of speech
produced by NNSs, the frequency of segmental substitutions was found to be highly 
correlated with NS judgments of accentedness (Brennan, Ryan, & Dawson, 1975). 
Although rare, especially among more proficient speakers, the extreme reduction or 
deletion of entire syllables can also interfere greatly with intelligibility (e.g., "decrating"
instead of "decorating"; Kang & Moran, 2014). Johansson (1978) found that NSs judged 
mispronounced consonant errors more severely than vowel errors and that 
mispronounced sounds in isolated words contributed more to listeners’ comprehension 
than errors in sentence and text levels. He also compared phonological and grammatical 
errors in L2 speech and found that phonological errors played a significant role in 
listeners' comprehension. 

Jenkins (2002) also asserted that certain pronunciation features are more important to 
intelligibility than others and therefore deserve more pedagogical focus. In her Lingua 
Franca Core (LFC) model, segmentals have primacy over suprasegmentals and 
consonants over vowels in communication between NNSs and NNSs. Moreover, Gimson 
(1970) claimed that accurate production of consonants was more essential to L2 
comprehension than native-like production of vowels, even though Schairer (1992) 
provided the opposite evidence for English-speaking learners of Spanish. In part 
because of these research findings regarding segmental importance, segmental accuracy 
has been stressed in pronunciation textbooks as well as ESL/EFL classrooms. 

The relative impact of segmental errors on listeners’ judgments can also be determined
by functional load (Brown, 1991; Catford, 1987). For example, interdental fricatives
carry a low functional load and are thus not high-priority sounds in communication.
Difficulty producing sounds with a high functional load such as /p/ and /b/ is more
likely to cause a breakdown in communication than difficulty with low functional load sounds
(Brown, 1991; Catford, 1987). Likewise, it has been found that as ESL learners progress, 
their high functional load errors (both vowels and consonants) decrease significantly 
although their low functional load errors may not (Kang & Moran, 2014). 

On the other hand, suprasegmental features of speech are associated with stretches that 
are larger than the segment (whether vowel or consonant), in particular pitch, stress, 
intonation, rhythm, or duration (Lehiste, 1970). Many studies have suggested that 
perceived foreign accent, intelligibility, and comprehensibility of NNSs’ English might be 
more greatly impacted by prosodic than segmental factors (Anderson-Hsieh, Johnson & 
Koehler, 1992; Derwing, Munro, & Wiebe, 1998; Hahn, 2004; Field, 2005; Isaacs, 2008; 
Kang, 2010). Marslen-Wilson (1987) argued for the low impact of segmental errors in L2
comprehension, stating that some phonemic errors might not be likely to disrupt 
communication due to more native-like suprasegmental features. 

Anderson-Hsieh et al. (1992) investigated the relationship between different types of 
pronunciation errors (particularly in prosody and segmentals), syllable structure, and 
NS listeners’ reactions in speech samples taken from the SPEAK Test. Although they 
found a strong correlation between the aforementioned pronunciation errors and global 
foreign accent, the prosodic variable proved to have the strongest effect. Other studies 
have further investigated different aspects of suprasegmental errors which could affect 
L2 perception, such as speech rate (Munro & Derwing, 1995; Isaacs, 2008; Kang, 2010),
voice quality (Munro, Derwing, & Burgess, 2003), several aspects of intonation 
(Wennerstrom, 2000), word/lexical stress (Field, 2005), and sentence (primary or 
nuclear) stress (Hahn, 2004; Kang, 2010). The contributions of these features to 
listeners’ perception have varied widely. 

NNSs from a variety of linguistic backgrounds seem to find the stress patterns of English 
particularly challenging. It is true that English learners often face problems such as 
misplacing word stress and sentence stress (Hahn, 2004), and stress patterns could 
easily cause communication breakdowns in the speech of NNSs (Gallego, 1990). In fact, 
according to Kang’s (2010) study, stress measures best predicted untrained raters’ 
accent ratings. In addition, the syllable structure associated with word stress is a critical 
component of intelligibility rating among ESL teachers (Zielinski, 2008). Therefore, in 
this study, suprasegmental errors mainly focused on word and sentence stress for an 
experimental purpose, in terms of their effects on NSs’ judgments of intelligibility, 
comprehensibility, and accentedness. Segmental errors included vowel and consonant 
errors. 

Methods 

Listeners 

Two hundred and forty university students (80 American, 80 Vietnamese, and 80 
Arabic) participated as listeners and were assigned into three groups. The American 
university students (32 males and 48 females) were enrolled in undergraduate 
university courses at a southwestern university. Their age ranged from 18 to 45 (M =
27.30, SD = 7.62). The Vietnamese listeners (17 males and 63 females) were first-year
university students from the English Department at a centrally located foreign language 
university in Vietnam. Their English proficiency proved to be upper intermediate with 
their age ranging from 18 to 20 (M = 18.25, SD = .65). Although these students had not 
taken the Test of English as a Foreign Language (TOEFL) or the International English
Language Testing System (IELTS), they had passed the English National Examination in
order to be accepted into the university. This corresponds approximately to the B1 
(CEFR) and/or a score of 4.5-6.0 on the IELTS. The Arabic students (27 males and 53 
females) were upper intermediate and advanced ESL students from an intensive English 
program (IEP) at a southwestern university in the United States with an age range of 18 
to 25 (M = 18.85, SD = .65). Proficiency levels were determined by the IEP's placement
and achievement tests. The mean length of their U.S. residence was 5.6 months. Among 
the 240 students, 112 participated in short interviews (80 Vietnamese, 19 American, 
and 13 Arabic students) after their speech ratings. Participants’ responses to their
background survey indicated that the American and Arabic listeners were not familiar 
with Vietnamese English L2 accent. All of the participants reported having normal 
hearing. All procedures were in accordance with the Institutional Review Board at the 
research university. 

Speech Stimuli 

Speech stimuli were prepared through several stages of screening, following
methods adopted from various sources (e.g., Gass & Varonis, 1994; Hahn, 2004; Munro
& Derwing, 1995). Ten Vietnamese speakers (5 males and 5 females) who were highly 
proficient in English (TOEFL scores above 100 out of 120) were initially recruited; the 
high TOEFL score helped to ensure that speakers would make few, if any, unintended 
pronunciation errors. They were graduate students in the USA aged from 26 to 34. They 
were asked to read 40 English sentences which consisted of 20 with segmental 
pronunciation errors and 20 with suprasegmental errors common for Vietnamese 
speakers. In particular, they were asked to mispronounce highlighted sounds of words 
(vowels and consonants) in given sentences and to misplace stress in words and 
sentences according to guidelines provided by the authors. (See Speech Materials in the 
following section and the Appendix.) 

Once the speech stimuli (400 sentences) made by 10 Vietnamese speakers were 
collected, the study recruited four linguistic experts, two native speakers and two
non-native speakers of English (one Vietnamese L1 and one Korean L1) who had substantial
linguistic/phonetic training as well as extensive experience in teaching ESL students. 
The linguistic experts were asked to test each of the sentences for its intended 
appropriateness and the accuracy (or inaccuracy) of the pronunciation. That is, while 
listening to speech files, they were asked to compare the scripts which had accurate 
sentences, to focus on words and sounds marked for the intended errors, and to 
determine whether or not the sentences included the intended errors properly made by 
the speakers. Lexical stress errors were verified by the location of stressed syllables and 
sentence stress by the placement of prominence on content words. Each sentence
included two or three intended errors. The experts were allowed to listen to
the stimuli multiple times. They then selected the sentences which contained errors 
suitably made for the purpose of this study. Among 400 sentences, the process of this 
stimuli screening yielded 29 sentences (6 sentences with consonant errors, 4 with vowel 
errors, 8 with word stress errors, and 11 with sentence stress errors), all of which were 
agreed upon by all four experts for the precision of errors. To keep the distribution
balanced across the phonetic categories, however, the study used only 16 sentences
(i.e., four sentences for each parameter). These sentences did not contain any
unintended pronunciation errors. 

The speech stimuli were further tested by four additional listeners for their coherence 
to target objectives before they were played for primary ratings. The raters were two 
graduate and two undergraduate American students who did not have any linguistic 
training background.

Speech Materials

Pronunciation errors often found in Vietnamese speakers of English are final consonant 
substitutions, final consonant cluster deletions, or mispronunciation of lax/tense vowels 
(Avery & Ehrlich, 1992; Christian, Wolfram, & Hatfield, 1986; Osburne, 1996). 
Suprasegmental errors such as the misplacement of lexical stress or sentence stress are 
not uncommon. The stimuli materials, sentences with problematic sounds expected for 
Vietnamese speakers of English, were prepared after consulting Avery and Ehrlich
(1992), Celce-Murcia, Brinton, and Goodwin (2010), Christian et al. (1986), and Morley 
(1992). The selected errors were further confirmed through personal contact of the 
second author with current Vietnamese teachers and students in Vietnam. Although 
characteristics of Vietnamese phonology may vary among regions of the country 
(Northern, Central, and Southern; Hwa-Froelich, Hodson, & Edwards, 2002), the difficult 
sounds (e.g., final consonants) chosen were mainly for Vietnamese speakers of English 
from Central Vietnam. 

Word-final voiceless sounds included /p, t, k, f/, as they are often mispronounced as a
mixture of /b, d, g, v/ by Vietnamese speakers. Vietnamese speakers often do not release
those consonants in final position, or they substitute other sounds for them (Hwa-Froelich
et al., 2002). Targeted word-final consonant clusters were /st, ts, ks, ft/. The
vowel contrasts in focus were /i:/ vs. /ɪ/, /e/ vs. /ɛ/, /u/ vs. /ʊ/, /o/ vs. /a/. Examples
of suprasegmental errors were misplaced stress on syllables in words (e.g., They are talking about
last year’s presidential Election) and misplaced stress on words in sentences, such as stressing
function words instead of content words (e.g., THERE WAS A terrible car accident ON 
THE corner). Using a headset, recordings were made digitally on a computer. The 
samples varied from 5 to 13 words, with a mean of 8.6 words. Each sample was between 
3.0 and 6.5 seconds long, with a mean length of 4.3 seconds. This speech rate 
(approximately 2 words/second) is at the low end of what previous research has found 
to be indicative of natural speech of native English speakers (i.e., 125-225 
words/minute or 2.08-3.75 words/second; Jones, Berry, & Stevens, 2007). 
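
The rate comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures reported in this section (mean sample length of 8.6 words and 4.3 seconds; a native range of 125-225 words/minute); the function and variable names are illustrative and not part of the study's materials.

```python
def words_per_second(words: float, seconds: float) -> float:
    """Speech rate as words divided by duration in seconds."""
    return words / seconds


# Mean stimulus length and duration as reported in the text.
mean_rate = words_per_second(8.6, 4.3)

# Convert the cited native-speaker range from words/minute to words/second.
native_range = tuple(round(wpm / 60, 2) for wpm in (125, 225))

print(f"stimulus rate: {mean_rate:.2f} words/second")
print(f"native range: {native_range[0]}-{native_range[1]} words/second")
```

Under these figures the stimuli sit just below the lower bound of the cited range, which is consistent with the description of the rate as being at the low end of natural native speech.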

Rating Instruments 

The study yielded four outcome measurements for listeners’ perceptual judgments: 
intelligibility, comprehensibility, accentedness, and global judgments. First, listeners 
were asked to listen to the entire set of 16 sentences initially and to rate the global 
comprehensibility and accentedness. These global measures were intended to assess 
listeners’ overall impression of the entire set of 16 sentences, but not of a specific category.
The global comprehensibility was measured on a 9-point Likert scale (1 = hard to
understand; 9 = easy to understand) and the global accentedness was assessed with
another 9-point scale (1 = has a strong accent; 9 = has no accent).

Next, comprehensibility and accentedness were individually assessed for each of the 
sentences. The comprehensibility measure also employed a 9-point bipolar scale 
adopting Munro and Derwing’s (1995) and Kang’s (2010) instruments. The listeners 
were asked to listen to the 16 sentences and to assign perceived comprehensibility (1
= hard to understand; 9 = easy to understand) for each sentence. The accentedness rating
scale (1 = has a strong accent; 9 = has no accent/native-like accent) was also adopted from
Kang’s (2010) and Munro and Derwing’s (1995) accent standardness rating scale.

After the global measurements and the individual accentedness and comprehensibility
measurements, intelligibility was measured employing Derwing and Munro’s (1997)
approach. All the 16 sentences were orthographically transcribed with multiple checks 
for accuracy. The listeners listened to each utterance and then wrote out in standard 
orthography exactly what they heard. For this task, the recording was only played once; 
however, listeners had heard the utterance twice previously during the global 
measurement tasks. Intelligibility was calculated by the percentage of words exactly 
matching the original transcription. Overall intelligibility scores for the four categories 
were calculated as each listener group's mean percentage of correctly transcribed words
in sentences. The mean scores for each sentence ranged from 20% to 74%. 
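
The word-match scoring described above can be sketched as follows. This is an illustration only: the text specifies that intelligibility is the percentage of words exactly matching the original transcription, but it does not say how case, punctuation, or word order were handled, so the lowercasing, the punctuation stripping, and the order-insensitive matching below are assumptions. The function names and the listener transcription are hypothetical; the reference sentence is one of the stimuli quoted earlier.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic words (assumed normalization)."""
    return re.findall(r"[a-z']+", text.lower())


def intelligibility(reference: str, transcription: str) -> float:
    """Percentage of reference words that also appear in the transcription.

    Matching is order-insensitive and counts each reference word at most
    as many times as the listener wrote it (multiset intersection).
    """
    ref = Counter(tokenize(reference))
    hyp = Counter(tokenize(transcription))
    total = sum(ref.values())
    if total == 0:
        raise ValueError("reference contains no words")
    matched = sum((ref & hyp).values())
    return 100.0 * matched / total


reference = "There was a terrible car accident on the corner"
heard = "there was a terrible car accident in the corner"  # hypothetical response
score = intelligibility(reference, heard)
print(f"{score:.1f}% of words matched")
```

A position-sensitive alignment would be stricter than this bag-of-words match; which variant the original scoring followed is not stated in the text.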

Procedures 

Listener participants completed a language background questionnaire that
asked about listeners’ language learning experience and their familiarity with the
Vietnamese accent. Listeners were then asked to listen to the entire set of 16 speech 
sentences as a whole and to complete global ratings of comprehensibility and 
accentedness. Two to five meetings for each L1 group were arranged in quiet
classrooms for these rating tasks. Each meeting consisted of 15-40 listeners. 

After a break, listeners were asked to listen to each of the speech samples individually 
for the ratings of comprehensibility and accentedness. They assigned rating scores to 
each of these two rating constructs for each sentence. All speech samples were 
randomly presented. Subsequently, for transcriptions of sentences that served to 
measure intelligibility, listeners were given booklets with numbered spaces. The 
participants were instructed to listen to each utterance and to write out in standard 
orthography exactly what they heard. There were approximately 1.5 minutes of pause 
between sentences. The stimuli were played only once. 

The target sentences were presented to each listener over earphones. We controlled the 
CD by pressing a pause button at the end of each utterance. A new stimulus was not 
presented until all the listeners had finished their rating of the previous one. Each 
meeting lasted approximately 1.5 hours. After the listeners completed their ratings, they 
took part in 5-10 minute interviews answering questions such as "When you listen to 
accented speech, to what pronunciation errors do you react most sensitively (e.g., 
vowels, consonants, word stress, sentence stress, intonation, and rhythm)? Why?" While 
19 out of the 80 American participants and 13 of the 80 Arabic listeners volunteered to 
participate in the interviews, all 80 Vietnamese participants contributed to this 
interview process. The American and Arabic students received course credit for their 
interview participation. For the Vietnamese students, participating in the interview was 
considered part of their English practice activity as well as an extra credit opportunity. 
All responses were recorded and notes were taken when necessary. 

Data Analysis 

The study yielded four dependent variables: global ratings, comprehensibility, 
accentedness, and intelligibility. A total of 16 sentences were divided into four sections: 
consonants, vowels, word stress, and sentence stress. Each section was composed of 
four sentences and each sentence included two to three category-specific errors. For 
convenience of subsequent analysis, the means of the four sentence judgment scores in
each section (i.e., the sum of the four sentence scores divided by four) were utilized as 
composite measures (except for the scores of the global ratings). Reliability coefficients 
(Cronbach’s alpha) were .87, .91, and .90 for intelligibility, comprehensibility, and 
accentedness, respectively. Quantitative analysis included one-way ANOVAs, 
correlations, and multiple regressions along with post hoc pair-wise Tukey tests. 
Interview data were used as supportive evidence for the quantitative data results 
(Creswell & Plano Clark, 2007). 
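
The composite measure and the reliability coefficient described above can be illustrated with a short sketch. The composite is the mean of the four sentence scores in a section, as stated in the text. Cronbach's alpha is computed with the standard formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores); the text does not specify which items entered the coefficient, so treating the four sentences of one section as items is an assumption, and the ratings below are invented for illustration rather than taken from the study.

```python
from statistics import variance


def composite(scores: list[float]) -> float:
    """Mean of the sentence scores in one section (sum divided by count)."""
    return sum(scores) / len(scores)


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for a listeners-by-items matrix of ratings.

    rows holds one list of item scores per listener; sample variances
    (n - 1 denominator) are used for both items and totals.
    """
    k = len(rows[0])
    item_columns = list(zip(*rows))
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 9-point ratings: 5 listeners x 4 sentences in one section.
ratings = [
    [2, 3, 3, 2],
    [4, 4, 5, 4],
    [3, 3, 4, 3],
    [5, 6, 5, 6],
    [1, 2, 2, 1],
]
composites = [composite(row) for row in ratings]
alpha = cronbach_alpha(ratings)
print(composites, round(alpha, 2))
```

Each listener contributes one composite per section, and alpha approaches 1 as the sentences within a section are rated more consistently across listeners.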

Results 

The study aimed to examine differences in perceptual ratings by three groups of 
listeners (NSs, NNSs from the same L1 as the speaker, and NNSs from a different L1 than
the speaker) on Vietnamese-accented speech in terms of the degree of perceived
comprehensibility, accentedness, and intelligibility. High values in ratings indicate 
listeners' positive judgments of the speakers (i.e., high intelligibility, high 
comprehensibility, and native-like accent). Table 1 displays the mean scores of three 
groups of comprehensibility ratings in four different categories of pronunciation errors. 
It also demonstrates how different pronunciation parameters affect listeners’ perceptual 
judgments. 

Table 1. Scores of Three Groups of Comprehensibility Ratings in Different Categories
of Pronunciation Errors

Listeners     Consonants     Vowels         Word Stress    Sentence Stress
              Mean (SD)      Mean (SD)      Mean (SD)      Mean (SD)
American      5.00 (1.38)    2.41 (1.02)    2.92 (1.30)    2.14 (1.03)
Vietnamese    2.50 (1.58)    3.57 (1.61)    3.46 (1.30)    3.67 (1.59)
Arabic        3.04 (1.85)    3.25 (1.38)    3.74 (1.40)    4.52 (1.70)

Note. Comprehensibility measure: 1 = hard to understand; 9 = easy to understand 

Listeners found Vietnamese speech generally hard to understand, as shown by mean
scores of 5 or lower on the 9-point scale in all categories, because all sentences contained
pronunciation errors. Vietnamese listeners viewed the speech as less comprehensible
when there were consonant errors in pronunciation. Conversely, American listeners
reacted more sensitively to vowel and suprasegmental errors, but the consonant errors
were the least influential factor for their comprehensibility judgments. Arabic listeners 
had trouble with both segmental and suprasegmental errors when listening to 
Vietnamese-accented speech. 

One-way ANOVA results revealed that all comparisons of the three groups of rating
scores for each of the pronunciation error categories were statistically significant: F(2,
237) = 60.30, p < .0005, partial eta squared = .34 for consonant errors; F(2, 237) = 16.67, p <
.0005, partial eta squared = .12 for vowel errors; F(2, 237) = 7.46, p < .001, partial eta
squared = .06 for word stress errors; and F(2, 237) = 49.52, p < .0005, partial eta squared =
.29 for sentence stress errors. According to post hoc Tukey test results, all comparisons
of mean scores of ratings between the American and Vietnamese listeners were 
statistically significant (p < .0005). The mean differences in rating scores between
American and Arabic listeners were also significant for all the categories of
pronunciation errors (p < .001), while a significant difference in ratings between
Vietnamese and Arabic listeners was found only in the sentence stress error category (p <
.001). A similar pattern was found in the results of accentedness ratings. As shown in
Table 2, all three groups of listeners found the speech samples relatively accented with 
mean scores of 5 or lower in the 9-point Likert scale. The American listeners reacted 
less sensitively to consonant errors than to other pronunciation errors in their accent 
judgments, whereas Vietnamese listeners treated the speech as more accented when 
there were consonant errors. As for Arabic listeners as non-native speakers listening to 
unfamiliar Vietnamese-accented speech, they perceived sentences with lexical stress 
errors as more accented than those with other errors. 
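
The effect sizes reported for the comprehensibility ANOVAs can be cross-checked against the F values. In a one-way design, partial eta squared equals SS_effect / (SS_effect + SS_error), which can be rewritten as (F * df_effect) / (F * df_effect + df_error). The sketch below applies that identity to the four reported F values with 2 and 237 degrees of freedom; the function and dictionary names are illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F value, reported partial eta squared) for comprehensibility, from the text.
reported = {
    "consonant": (60.30, 0.34),
    "vowel": (16.67, 0.12),
    "word stress": (7.46, 0.06),
    "sentence stress": (49.52, 0.29),
}

recomputed = {
    name: round(partial_eta_squared(f, 2, 237), 2)
    for name, (f, _) in reported.items()
}

for name, (f, eta) in reported.items():
    print(f"{name}: F = {f}, reported = {eta}, recomputed = {recomputed[name]}")
```

The recomputed values agree with the reported ones to two decimal places, which also supports reading the degrees of freedom as 2 (three listener groups minus one) and 237 (240 listeners minus three groups).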

Table 2. Mean Scores of Three Groups of Accentedness Ratings in Different
Categories of Pronunciation Errors

Listeners     Consonants     Vowels         Word Stress    Sentence Stress
              Mean (SD)      Mean (SD)      Mean (SD)      Mean (SD)
American      4.77 (1.54)    2.56 (1.11)    2.54 (1.10)    2.30 (1.03)
Vietnamese    2.69 (1.35)    3.88 (1.80)    3.71 (1.69)    4.38 (1.59)
Arabic        4.72 (1.87)    4.22 (1.92)    2.88 (2.10)    4.74 (1.70)

Note. Accentedness measure: 1 = has a strong accent; 9 = has no accent/native-like accent

Among the three groups of listeners, statistical differences were found in mean scores of 
accentedness ratings. The results of one-way ANOVAs were F(2, 237) = 43.34, p < .0005,
partial eta squared = .27 for consonant errors; F(2, 237) = 22.36, p < .0005, partial eta
squared = .16 for vowel errors; F(2, 237) = 14.64, p < .001, partial eta squared = .11 for
word stress errors; and F(2, 237) = 47.46, p < .0005, partial eta squared = .29 for sentence
stress errors. For each of the parameters, ratings of American listeners were statistically 
different from those of the Vietnamese (p < .0005). When it comes to accent ratings 
between American and Arabic listeners, significant differences were found in the 
categories of vowel and word stress errors. That is, the U.S. listeners found the 
Vietnamese speech slightly more accented than the Vietnamese listeners, when speech 
had pronunciation problems with vowels and word stress. Due to Vietnamese raters’ 
sensitivity to consonant errors, accent ratings of Vietnamese listeners were significantly 
lower than those of Arabic listeners when sentences had consonant problems (p < .001). 
Intelligibility ratings also revealed a similar pattern in terms of how listeners apply their 
pronunciation parameters for their perceptual judgments. Table 3 shows mean scores of 
three groups of intelligibility ratings in different categories of pronunciation errors. 

Table 3. Mean Scores of Three Groups of Intelligibility Ratings in Different 
Categories of Pronunciation Errors 

Listeners     Consonants       Vowels           Word Stress      Sentence Stress
              Mean % (SD %)    Mean % (SD %)    Mean % (SD %)    Mean % (SD %)
American      82 (12)          63 (11)          45 (12)          29 (11)
Vietnamese     9 (10)          25 (11)          23 (11)          28 (10)
Arabic        25 (12)          40 (14)          27 (10)          13 (10)

Note. Intelligibility scores: the percentage of words exactly matching the original transcription 

Intelligibility scores appeared generally lower in NNS listeners (Arabic and Vietnamese 
listeners) compared to those in American listeners, perhaps due to NNSs’ command of 
the English language itself. Listeners were required to transcribe the entire sentence 
after listening to the stimuli only once during that task. It is possible that transcribing 
might not have been an easy task for NNS participants in this study. Notwithstanding 
this proficiency issue, there was a noticeable contrast found between ratings of 
American listeners and Vietnamese listeners in terms of their reaction to pronunciation 
errors. When the speech had consonant errors, sentences were transcribed up to 82% 
correctly by American listeners, but only 9% by Vietnamese listeners. American 
listeners' intelligibility scores decreased with speech which had suprasegmental errors. 
These suprasegmental errors affected Arabic listeners in a similar manner. In contrast, 
Vietnamese intelligibility scores increased with such errors. 
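
The intelligibility measure described above (the percentage of words in a listener's transcription that match the original sentence) can be illustrated with a minimal sketch. The function name and the matching rules are assumptions made for illustration only: the study does not specify how case, punctuation, or word order were handled, so the sketch ignores all three, and the sample transcription is hypothetical.

```python
from collections import Counter
import re


def intelligibility_score(reference: str, transcription: str) -> float:
    """Percentage of reference words that also appear in the transcription.

    Assumption (not from the study): matching is case-insensitive, ignores
    punctuation and word order, and credits a repeated reference word only
    as many times as the listener actually wrote it.
    """
    def tokenize(text):
        return re.findall(r"[a-z']+", text.lower())

    ref_counts = Counter(tokenize(reference))
    heard_counts = Counter(tokenize(transcription))
    # Counter intersection keeps the smaller count for each shared word.
    matched = sum((ref_counts & heard_counts).values())
    total = sum(ref_counts.values())
    return 100.0 * matched / total if total else 0.0


# Reference sentence from the Appendix; the transcription is a made-up example.
reference = "Before he left, he washed all the plates."
transcription = "before he left he wash all the plate"
print(round(intelligibility_score(reference, transcription)))  # 75
```

In this example, six of the eight reference words are matched ("washed" and "plates" are not), which gives 75%. A stricter, position-sensitive rule would yield lower scores for transcriptions with reordered words.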

One-way ANOVA results showed statistically significant differences in the intelligibility 
ratings by the three groups of listeners: F(2, 237) = 63.12, p < .0005, partial eta squared = 
.34 for consonant errors; F(2, 237) = 23.67, p < .0005, partial eta squared = .16 for vowel 
errors; F(2, 237) = 45.41, p < .001, partial eta squared = .27 for word stress errors; and 
F(2, 237) = 59.52, p < .0005, partial eta squared = .32 for sentence stress errors. The post hoc 
Tukey results indicated that, except for the sentence stress category, the 
comparisons of mean scores of ratings between the American and Vietnamese listeners 
were statistically significant (p < .0005). Mean differences in rating scores between 
American and Arabic listeners were also significant for all the categories of 
pronunciation errors (p < .001). Ratings by Vietnamese and Arabic listeners were 
statistically different in all the error sections (p < .001), except the word stress section. 
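
The one-way ANOVA statistics reported in this section can be illustrated with a short sketch. It computes the F ratio and eta squared from raw group scores and also shows that, in a one-way design, partial eta squared can be recovered from F and its degrees of freedom as F·df1 / (F·df1 + df2). The function names and the toy ratings are hypothetical; the only values taken from the study are the accentedness F ratios reported with Table 2.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a one-way design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    # With a single factor, partial eta squared equals eta squared.
    eta_sq = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, eta_sq


def eta_squared_from_f(f_ratio, df_between, df_within):
    """Recover (partial) eta squared from a reported one-way F ratio."""
    return f_ratio * df_between / (f_ratio * df_between + df_within)


# Hypothetical toy ratings for three listener groups (not data from the study).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df1, df2, eta_sq = one_way_anova(toy_groups)
print(round(f_ratio, 2), df1, df2, round(eta_sq, 4))  # 13.0 2 6 0.8125

# Accentedness F ratios as reported above, each with df = (2, 237).
reported_f = {"consonant": 43.34, "vowel": 22.36,
              "word stress": 14.64, "sentence stress": 47.46}
for category, f_value in reported_f.items():
    print(category, round(eta_squared_from_f(f_value, 2, 237), 2))
```

Applied to the accentedness F ratios, the conversion returns .27, .16, .11, and .29, which match the effect sizes reported for that measure.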

As described in the Methods section, participants were initially asked to listen to the 
entire speech samples for the global judgments of comprehensibility and accentedness 
before any specific ratings. Global comprehensibility and accent ratings were collected 
to determine whether students’ overall perceptions of the speech differed from their 
individual assessments. Overall, listeners of all three groups found the speech very 
accented and hard to comprehend (see Table 4). Although statistical significance 
(family-wise F-values) existed in mean comparisons of all three groups (F(2, 237) = 
12.30, p < .0005 for global comprehensibility ratings; and F(2, 237) = 12.61, p < .0005 for 
global accentedness ratings), the actual differences in scores were relatively minimal. 
Nevertheless, Vietnamese listeners, listening to their own L1-accented English speech, 
tended to be slightly more lenient than other groups of listeners both in ratings of global 
comprehensibility (post hoc Tukey tests, p < .001) and accentedness (post hoc Tukey 
tests, p < .001). 

Table 4. Mean Scores of Three Groups of Global Accentedness and Comprehensibility 
Ratings 

Listeners     Global Comprehensibility Ratings    Global Accentedness Ratings
American      3.36 (1.73)                         2.61 (1.35)
Vietnamese    3.61 (1.89)                         3.51 (2.15)
Arabic        2.17 (1.04)                         2.23 (1.29)

Statistical analysis showed that Arabic listeners (NNSs with a different L1 from the 
speaker), who were not familiar with Vietnamese-accented speech, were harsher 
than NSs (American listeners) or NNSs with the same L1 as the speaker. Tukey tests 
confirmed that their rating scores for global comprehensibility were significantly lower 
than those of the other two groups of listeners (p < .0005). Likewise, in terms of global 
accentedness ratings, both American listeners and Arabic listeners found the 
Vietnamese speech more accented than the Vietnamese listeners did. Although no 
significant correlation was found among the three groups of listeners for their 
comprehensibility ratings, the global accentedness ratings of American NSs were 
moderately correlated with those of Arabic listeners (r = .41). 

Furthermore, Vietnamese listeners appeared to be somewhat more distinct from the 
other two groups of listeners with regard to their speech perception and application of 
pronunciation parameters. The results of multiple regression analyses confirmed that, 
for global comprehensibility and accentedness ratings, the sentence 
stress error variable was a significant and potent predictor for both American listeners 
(β = .56 and higher, p < .005) and Arabic listeners (β = .47 and higher, p < .005), but for 
Vietnamese listeners the consonant error variable was the strongest predictor of their 
global judgments (β = .31 and higher, p < .01). 
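
To make the notion of a standardized regression coefficient (β) concrete, the following minimal sketch estimates β weights by z-scoring every variable and solving the normal equations built from the correlation matrix. It is an illustration under stated assumptions, not the analysis run in this study: the function names, the two-predictor setup, and all rating values are hypothetical, and the toy outcome is constructed as an exact weighted sum so that the result is easy to verify.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of scores to mean 0 and SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def standardized_betas(predictors, outcome):
    """Return one standardized coefficient per predictor."""
    zx = [zscores(p) for p in predictors]
    zy = zscores(outcome)
    n = len(zy)
    # Correlations among predictors, and between each predictor and the outcome.
    corr = [[sum(a * b for a, b in zip(p, q)) / n for q in zx] for p in zx]
    corr_y = [sum(a * b for a, b in zip(p, zy)) / n for p in zx]
    return solve(corr, corr_y)


# Hypothetical ratings of six sentences (not data from the study).
consonant = [2, 4, 3, 5, 1, 4]
sentence_stress = [3, 1, 4, 2, 5, 5]
# Toy global rating driven mostly by the sentence stress rating.
global_rating = [0.2 * c + 0.8 * s for c, s in zip(consonant, sentence_stress)]

betas = standardized_betas([consonant, sentence_stress], global_rating)
print(betas[1] > betas[0])  # True
```

Because all variables are standardized, the coefficients are directly comparable, which is what allows statements such as "sentence stress was the strongest predictor" for a given listener group.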

Interview Responses 

One hundred twelve participants (80 Vietnamese, 19 American, and 13 Arabic students) 
took part in short interviews directly after their speech ratings. The interviews were in 
group format and informal; they lasted between five and ten minutes. The interview 
questions were asked in English for both the American and Arabic L1 listeners. For the 
Vietnamese group, the questions were first asked in English and then translated to 
Vietnamese to ensure full comprehension. Each interview session was videotaped by the 
researcher. Listener participants answered questions generally related to their 
perceptual judgments and processes of evaluating accented speech. 

The interviews were the primary means by which the focus of pronunciation instruction 
was ascertained, and they revealed the connection between listeners’ judgments and explicit 
pronunciation teaching. For example, the Vietnamese listeners’ strong sensitivity to 
consonant errors was evident in these interview reports. In response to a question about 
pronunciation features that may affect their accent judgments, approximately 90% of 
the 80 Vietnamese respondents mentioned consonant-related errors and their 
importance in their pronunciation learning and evaluation (i.e., 51% for only 
consonants, 10% for consonants and vowels, 27% for consonants and other features, 
3% for vowels only, and 9% for others). The following comment from one of the 
Vietnamese respondents further supports this pattern: "consonants because my 
teachers teach me a lot of consonant errors compared to other types." In fact, there was 
a clear tendency among Vietnamese listeners for their judgments to be 
closely intertwined with their current speaking/conversation class curricula. Almost all 
of the respondents identified the link between the two, adding that their English (EFL) 
teachers often emphasized the significance of consonants, followed by vowels, so that 
instruction was still limited to segmental features only. 

The influence of teachers’ instruction on perceptual judgments was also found 
among Arabic listeners, who were ESL learners at an Intensive English Program in the USA. For 
example, the following comment was made by one of the Arabic listeners: "currently we 
are learning about stress a lot. I know that I have to pay more attention to stress." Others 
explicitly remarked on the effect that consonant instruction had on their 
evaluations (note that although the Arabic students as a whole attended more to 
suprasegmental features, this was not necessarily true for each Arabic listener). For 
instance, one Arabic student responded, "When listening to accented speech, I react 
most sensitively to consonant errors because I practice consonants a lot with my teacher 
so I know them and I know very quickly who is speaking with consonant errors." 
Although Arabic respondents’ comments varied, most of them commented on their 
current pronunciation curriculum and its influence on their perception. 

An additional 19 U.S. undergraduate listeners also provided various responses ranging 
from ratings grounded in segmentals to those grounded in suprasegmentals. However, 
their comments included mostly features related to vowels and other suprasegmental 
parameters, but not necessarily to consonants. For example, as one participant noted, "I 
think what I notice first in accents are things that make them markedly different from 
my own American accent. For example, generally with Italians, what I notice first are the 
rhythm of speaking and the different syllable stresses..." The U.S. undergraduate 
responses were especially interesting because this group was not receiving 
pronunciation instruction and therefore could not link their judgments to pronunciation 
pedagogy. It is clear from these interview responses that listeners in different groups 
attend to different aspects of pronunciation when they listen to NNSs’ speech. 

Discussion 

The purpose of this study was to examine how different groups of untrained listeners 
differ in using phonetic parameters (segmentals vs. suprasegmentals) to make their 
perceptual judgments of accented speech. The study also aimed to offer more empirical 
evidence to support claims about how segmentals and suprasegmentals affect native 
and nonnative listeners’ comprehension of nonnative speech. In this particular study, 
the judgments of NS listeners (American) and of NNS listeners with a different L1 from 
the speaker (Arabic) were somewhat more sensitive to suprasegmental errors such as sentence 
stress errors, whereas NNSs with the same L1 as the speaker (Vietnamese) reacted 
more sensitively to segmental errors (e.g., consonant clusters) when listening to 
Vietnamese-accented English. These findings suggest that listeners of English perceive 
accented speech in fundamentally different ways, depending on their L1 backgrounds 
and the focus of their pronunciation instruction. However, this conclusion should be 
considered with caution; without additional combinations of NS-NNS and NNS-NNS, it is 
not possible to determine whether this is a result solely of the relationship between 
speakers' L1s or whether it is driven by the specific features of the languages targeted in this 
study. 

Our overall findings are somewhat opposite to those of Riney et al.’s (2005) study, in 
which untrained Japanese listeners used primarily non-segmental parameters to make 
perceptual judgments and untrained American listeners applied segmental parameters 
more. There are a couple of possible explanations for this. First, as suggested by Riney et 
al., it is possible that suprasegmental features "sounded louder" (2005, p. 460) to 
Japanese listeners because the listeners did not make the same segmental distinctions 
that American listeners did. Specifically, many Japanese learners of English (even those 
with advanced proficiency) may not hear the English /r/ versus /l/ distinction in the 
same way that American listeners do (Takagi, 1993, 2002). Thus, this segmental feature 
may have served as a signal of accentedness to the American listeners, but Japanese 
listeners may not have attended to it. Second, the focus of pronunciation instruction was 
not ascertained in Riney et al. (2005). It is possible that the Japanese learners weighted 
suprasegmental parameters more heavily when evaluating accent because their 
pronunciation instruction had a suprasegmental emphasis. 

The results of multiple regression analyses indicated that sentence stress was the most 
salient predictor of global perceptual judgments for American and Arabic listeners, 
whereas the consonant-related variable most strongly predicted global 
judgment scores when Vietnamese listeners rated the Vietnamese-accented speech. The 
high correlation between overall comprehensibility scores and prosody features has 
been well documented (e.g., Anderson-Hsieh et al., 1992; Field, 2005; Kang, 2010; 
Munro & Derwing, 1995). The results were also in line with Hahn’s (2004) conclusion 
that the sentence stress errors of the NNS utterances made it difficult for native listeners 
to comprehend NNSs’ speech. 

What was not expected, however, is the distinctive pattern found among Vietnamese 
learners of English acting as listeners who evaluated Vietnamese-accented 
speech. Segmental deviance, particularly consonant errors, affected their speech 
evaluations more adversely than did suprasegmental deviance. Interview responses 
gathered from the 80 respondents supported this tendency in that approximately 
90% of Vietnamese listeners raised segmental (especially consonant-related) issues 
(i.e., consonant features were their main concern, rather than other pronunciation 
characteristics). Interestingly, according to Vietnamese respondents, this judgment 
pattern originated from the pronunciation curriculum (i.e., mainly segment-focused, 
and especially consonant-focused, pronunciation instruction) that they had received in Vietnam. In 
fact, Vietnamese listeners found terms such as lexical stress or sentence stress 
somewhat foreign, as they often seemed to conceive of pronunciation only as vowels 
and consonants. Therefore, the focus of pronunciation instruction seems to contribute to 
how speech is understood and evaluated. 

Another finding is the dissimilarity in global rating judgments among the three groups of 
listeners (NSs, NNSs with the same L1 as the speaker, and NNSs with a different L1 from 
the speaker). As the Vietnamese listeners might have benefited from listening to 
Vietnamese-accented English, their scores for global comprehensibility and 
accentedness were slightly higher than those of the NSs (American listeners) or other NNSs 
(Arabic listeners). This result concurs with findings of previous studies where the 
Japanese listeners rated the Japanese speakers as easier to understand than the 
Cantonese speakers (Munro et al., 2006; Smith & Bisazza, 1982). Thus, when looking at 
comprehensibility scores, the current finding seems to add evidence to 
support an intelligibility benefit (Bent & Bradlow, 2003) for speech produced in the 
listeners’ own L1 accent. However, the intelligibility scores do not reflect the Vietnamese 
listeners’ confidence in comprehension. In fact, American L1 and even Arabic L1 listeners 
surpassed Vietnamese listeners on the intelligibility measure on all but the sentences 
with misplaced sentence stress, on which the Vietnamese listeners outscored the Arabic listeners. 

Factors that affect listeners’ judgments of accented speech have been broadly studied, 
particularly with regard to L1 effect or accent familiarity (e.g., Bent & Bradlow, 2003; 
Gass & Varonis, 1984; Kang, in press; Munro, 2008). Findings have been mixed so far, 
however. While some studies found that prior exposure to varieties of accent does 
facilitate speech comprehension (Field, 2003; Gass & Varonis, 1984), others found no 
such effect (e.g., Munro, Derwing, & Morton, 2006). In Munro et al.’s (2006) study, for 
example, the listener groups of different L1s showed moderate to high correlations on 
intelligibility scores and comprehensibility and accentedness ratings, regardless of 
native language background. The current study exhibited exactly this complexity of 
listeners’ perception. Global accent ratings yielded a moderate correlation (r = 
.41) between NS listeners and NNS Arabic listeners, but the three groups of listeners 
were not significantly correlated in their global comprehensibility ratings. This means 
that listeners’ native language background did play a considerable role in their 
comprehensibility judgments of accented speech. 

Along with the listener’s L1 background, listeners’ native English language status is 
another factor to consider. Findings of previous research on this topic are also 
inconclusive. In some studies (Fayer & Krasinski, 1987; Kang, in press), NNS listeners 
tend to be more severe in their assessments than NS listeners. In others, NS raters are 
harsher than NNS raters (Brown, 1995) or NS and NNS raters exhibit similar severity 
patterns (Kim, 2009). In this study, findings also differed depending on the constructs of 
the ratings. The NNSs with a different L1 from the speaker (Arabic listeners), 
who were not familiar with Vietnamese-accented speech, were harsher than NSs 
(American listeners) or NNSs with the same L1 as the speaker, especially in global 
comprehensibility ratings, but no significant rating difference between U.S. listeners and 
Arabic listeners emerged in global accent ratings. 

The ESL/EFL distinction may play a role here as well, as the Arabic listeners had 
received instruction in an ESL environment while the Vietnamese listeners were EFL 
students. The need for EFL teachers’ pronunciation training has been particularly 
emphasized (Breitkreutz, Derwing, & Rossiter, 2001; Burgess & Spencer, 2000; 
MacDonald, 2002; Wang & Munro, 2004). Good pronunciation programs taught by 
professionally trained instructors may not often be available, and teachers themselves 
may be confused about what is possible or desirable in pronunciation instruction 
(Derwing & Munro, 2005). However, another urgent issue to address in matters of 
pronunciation instruction is appropriate training in pronunciation pedagogy in EFL 
contexts. The listeners’ voices in this study suggest that teachers’ instructional 
approach to pronunciation could play a critical role in shaping learners’ perception of 
accented speech. Learners’ pronunciation issues might be caused not only by students’ 
but also by teachers’ lack of awareness of functional features of L2 speech and their 
relationship with listeners’ perception. Note that American listeners as NS listeners, and 
even Arabic listeners as NNS listeners in the ESL environment who did not share the 
same L1 as the speaker in this study, reported that they mainly attended to 
suprasegmental features (sentence or word stress) when they listened to accented 
speech. In contrast, Vietnamese learners of English tended to prioritize segmentals 
(i.e., consonant features only). It is possible that Vietnamese learners of English may 
encounter disadvantages in international communicative situations in which L2 learners 
need to function. A suggestion that emerges from these findings is that in NS-NNS 
listener research, a distinction needs to be made among NNS listeners: (1) NNSs in an 
ESL setting and (2) NNSs in an EFL setting. 

Conclusion 

Several implications can be drawn from our findings. We saw that listener factors (L1 
background and language learning experience) could affect perceptual 
judgments of accented speech. Nevertheless, a question might still remain regarding 
whether or not this background factor plays a more important role in an EFL context 
than in an ESL context due to different instructional methods. This question can 
be further investigated in future research. In addition, the findings emphasize the 
importance of teachers’ roles in pronunciation instruction, particularly in shaping learners’ 
perceptual judgments of L2 speech. As for individual speech properties, three groups of 
listeners (NSs, NNSs with the same L1 as the speaker, and NNSs with a different L1 from 
the speaker) applied different phonetic parameters to their perceptual judgments. 
However, some correlations were found in global judgments (i.e., accentedness) among 
different L1 groups, which implies that listeners perhaps attend to different speech 
properties depending on types of speech rating constructs. Further research is called for 
concerning the effect of listening assessment constructs on listeners’ use of phonetic 
parameters. 

The implications of these findings can extend to the argument of Jenkins’ (2002) Lingua 
Franca Core (LFC). According to our findings, the LFC assertion that segmentals trump 
suprasegmentals and consonants trump vowels in NS/NNS communication can be true 
with Vietnamese speakers of English in an EFL environment, but not with Arabic 
speakers of English in an ESL environment. Jenkins’ LFC is known as a model that 
explains communication between NNSs and NNSs. However, the results of this study 
suggest that within NNSs' communication, a more specific categorization of speech 
features may be needed to better understand successful oral communication. In 
addition, NNS listeners’ language learning background should be taken into 
consideration before involving NNSs in any speech ratings. 

Finally, the findings of this study provide support for both suprasegmental and 
segmental focus in pronunciation teaching (Anderson-Hsieh et al., 1992; Derwing, 
Munro, & Wiebe, 1998). EFL/ESL teachers should develop their pronunciation 
curriculum considering functional features of L2 speech and their relationship with 
listeners’ judgments of intelligibility, comprehensibility, and accentedness. Judging from 
comments by Vietnamese learners of English, students in that setting very much desire 
feature-balanced, curriculum-efficient pronunciation instruction. 

Despite the implications listed above, the study can be further improved by overcoming 
a few limitations and expanding the scope of the study in the future. First, it would be 
beneficial to include more L1 backgrounds for both speakers and listeners in order to 
lessen the possibility that the results are based on language-specific characteristics. 
Also, the study treated post-secondary Vietnamese listeners as upper-intermediate 
English speakers, but no official English proficiency scores of these Vietnamese listeners 
were collected. Their English proficiency could have affected the results of this study. 
One particularly interesting facet would be to see if there is any difference among 
Vietnamese speakers of English from two different English-speaking contexts: (1) ESL and 
(2) EFL. That is, do listeners from the same L1 perceive their L1-related accented speech 
differently in different contexts? Additionally, grouping participants so that there were 
different focal points of pronunciation instruction within each of the listener groups 
would ensure that language background and emphasis of pronunciation pedagogy were 
not confounding variables. As a final point, phonetic parameters examined in this study 
were limited to four features (consonants, vowels, lexical stress, and sentence 
stress). A more comprehensive approach including other features of pronunciation (e.g., 
rhythm and intonation) or lexico-grammar is recommended. 

It is important to bear in mind that this study lends just one piece to the puzzle of 
intelligibility of accented speech. Listeners’ comprehension, which is integral to 
communication as well as assessment, was not measured. Because of this limitation, it 
would be difficult to justify any conclusion that a listener’s specific performance was a 
direct result of his or her perception of the phonological features of accented speech. In 
fact, we do not know from the current study how the speaker’s perceived accent would 
affect the listener’s performance, an issue that is crucial for standardized English tests 
such as the TOEFL or the IELTS. Clearly, more work will need to be done in order to 
better understand the connections between pronunciation instruction, phonology, 
perception, and performance. 

Appendix 


Materials Used for Speech Stimuli 

Consonant 

1. What do "tripe" and "bet" mean? 
2. The roof was broken after the worst storm one week ago. 
3. John told his parents the truth which gave them shocks. 
4. Before he left, he washed all the plates. 

Vowel 

1. He put a red sheep on a red ship. 
2. The lady set the pepper on the paper. 
3. There is black soot on a black suit. 
4. Dirk’s duck was on the dock. 

Word Stress 

1. They are talking about last year’s presidential Election. 
2. ReCENTly, there has been an increase in car imPORTS in Vietnam. 
3. She’s a wonDERful MUsician. 
4. We will proBABly go Together. 

Sentence Stress 

1. THERE WAS A terrible car accident ON THE corner. 
2. MY landlord collects THE rent payment ON THE FIRST OF THE month. 
3. Patience is THE KEY TO joy; but haste is THE KEY TO sorrow. 
4. He ate A lettuce AND tomato salad FOR lunch. 

Measure of speaker comprehensibility (adapted from Kang, 2010) 

The utterance I just listened ... 

was easy to understand _/_/_/_/_/_/_/_/_ was hard to understand 

Measure of speaker accentedness (adapted from Kang, 2010) 

The utterance I just listened ... 

has no accent _/_/_/_/_/_/_/_/_ has a strong accent 

Global Judgments (adapted from Kang, 2010) 

Measure of speaker global comprehensibility 

The speaker to whom I just listened ... 

was easy to understand _/_/_/_/_/_/_/_/_ was hard to understand 

Measure of speaker global accentedness 

The speaker to whom I just listened ... 

has no accent _/_/_/_/_/_/_/_/_ has a strong accent 